-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Conversation
Hey @samskalicky , Thanks for submitting the PR
CI supported jobs: [windows-cpu, unix-gpu, centos-cpu, edge, windows-gpu, unix-cpu, clang, miscellaneous, centos-gpu, website, sanity] Note: |
Manual build on Mac runs the gemm_lib example in extensions without segfault |
include/mxnet/c_api.h
Outdated
* \return 0 when success, -1 when failure happens. | ||
*/ | ||
MXNET_DLL int MXLoadLib(const char *path, unsigned verbose); | ||
MXNET_DLL int MXLoadLib(const char *path, unsigned verbose, void** lib); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a C API change. Is it required for the fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review @szha! No this change isnt required, but without it we cannot return a handle to the opened library. Its not required per-se since opening the same library using ctypes or some other API will return the same handle anyway (with linux doing the mapping). So we can remove this change if that is important, either way is fine to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C API is part of the API to maintain semantic versioning for. Let's not include the API change then.
Thanks for the PR. The destructor change looks good. What's the benefit of returning handle to opened library, if it's not recommended that users close them? Is there any other use case? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Description
stop closing opened libs. fixes #20411 using portions already merged into master/2.0 in #19016.
According to [1]
So calling
dlclose
is just a suggestion to the OS that it could close the library. But the OS may not do anything (ie. and wait until the process exits to clean up any opened handles). Im thinking we might as well just let it be cleaned up anyway rather than callingdlclose
at all.As suggested in [2]
So if a user was loading/unloading a bunch of libraries at runtime they might want to close some they dont need anymore.
However, since MXNet registers components from the library (ie. operators, partitioners, graph passes, etc) and has NO capability to unregister them the references to the loaded library will live at least as long as MXNet (ie. libmxnet.so) does. MXNet already has quite a few components (or its dependencies like TVM/NNVM) that live forever and require the OS to clean them up when the process exits. So its not possible to close libmxnet.so and assume everything is cleaned up. We might as well do the same thing for custom libraries.
This PR removes the code to close opened libraries in the destructor of the
LibraryInitializer
class with the caveat that MXNet will still have pointers to that library, so closing would not be recommended until process exit anyway.